Discriminative training of GMM-HMM acoustic model by RPCL learning
نویسندگان
چکیده
This paper presents a new discriminative approach for training Gaussian mixture models (GMMs) of hidden Markov models (HMMs) based acoustic model in a large vocabulary continuous speech recognition (LVCSR) system. This approach is featured by embedding a rival penalized competitive learning (RPCL) mechanism on the level of hidden Markov states. For every input, the correct identity state, called winner and obtained by the Viterbi force alignment, is enhanced to describe this input while its most competitive rival is penalized by de-learning, which makes GMMs-based states become more discriminative.Without the extensive computing burden required by typical discriminative learning methods for one-pass recognition of the training set, the new approach saves computing costs considerably. Experiments show that the proposed method has a good convergence with better performances than the classical maximum likelihood estimation (MLE) based method. Comparing with two conventional discriminative methods, the proposed method demonstrates improved generalization ability, especially when the test set is not well matched with the training set.
منابع مشابه
Discriminative training of GMM-HMM acoustic model by RPCL type Bayesian Ying-Yang harmony learning
This paper presents a new discriminative approach for training Gaussian mixture models (GMMs) of hidden Markov models (HMMs) based acoustic model in a large vocabulary continuous speech recognition (LVCSR) system. This approach is featured by embedding a rival penalized competitive learning (RPCL) mechanism on the level of hidden Markov states. For every input, the correct identity state, calle...
متن کاملA scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR
We present a new scalable approach to using deep neural network (DNN) derived features in Gaussian mixture density hidden Markov model (GMM-HMM) based acoustic modeling for large vocabulary continuous speech recognition (LVCSR). The DNN-based feature extractor is trained from a subset of training data to mitigate the scalability issue of DNN training, while GMM-HMMs are trained by using state-o...
متن کاملExploiting foreign resources for DNN-based ASR
Manual transcription of audio databases for the development of automatic speech recognition (ASR) systems is a costly and time-consuming process. In the context of deriving acoustic models adapted to a specific application, or in low-resource scenarios, it is therefore essential to explore alternatives capable of improving speech recognition results. In this paper, we investigate the relevance ...
متن کاملA Comparative Study of RPCL and MCE Based Discriminative Training Methods for LVCSR
This paper presents a comparative study of two discriminative methods, i.e., Rival Penalized Competitive Learning (RPCL) and Minimum Classification Error (MCE), for the tasks of Large Vocabulary Continuous Speech Recognition (LVCSR). MCE aims at minimizing a smoothed sentence error on training data, while RPCL focuses on avoiding misclassification through enforcing the learning of correct class...
متن کاملGMM-Free Flat Start Sequence-Discriminative DNN Training
Recently, attempts have been made to remove Gaussian mixture models (GMM) from the training process of deep neural network-based hidden Markov models (HMM/DNN). For the GMM-free training of a HMM/DNN hybrid we have to solve two problems, namely the initial alignment of the frame-level state labels and the creation of context-dependent states. Although flat-start training via iteratively realign...
متن کامل